NSF PAR Search | NSF Public Access Repository


HQAlign: aligning nanopore reads for SV detection using current-level modeling

https://doi.org/10.1093/bioinformatics/btad580

Joshi, Dhaivat; Diggavi, Suhas; Chaisson, Mark J; Kannan, Sreeram (October 2023, Bioinformatics)
Alkan, Can (Ed.)
Abstract
Motivation: Detection of structural variants (SVs) from the alignment of sample DNA reads to the reference genome is an important problem in understanding human diseases. Long reads that can span repeat regions, along with an accurate alignment of these long reads, play an important role in identifying novel SVs. Long-read sequencers, such as nanopore sequencing, can address this problem by providing very long reads but with high error rates, making accurate alignment challenging. Many errors induced by nanopore sequencing have a bias because of the physics of the sequencing process, and proper utilization of these error characteristics can play an important role in designing a robust aligner for SV detection problems. In this article, we design and evaluate HQAlign, an aligner for SV detection using nanopore sequenced reads. The key ideas of HQAlign include (i) using base-called nanopore reads along with the nanopore physics to improve alignments for SVs, (ii) incorporating SV-specific changes to the alignment pipeline, and (iii) adapting these into an existing state-of-the-art long-read aligner pipeline, minimap2 (v2.24), for efficient alignments.
Results: We show that HQAlign captures about 4%–6% complementary SVs across different datasets, which are missed by minimap2 alignments, while having a standalone performance on par with minimap2 for real nanopore read data. For the common SV calls between HQAlign and minimap2, HQAlign improves the start and the end breakpoint accuracy by about 10%–50% for SVs across different datasets. Moreover, HQAlign improves the alignment rate to 89.35% from minimap2's 85.64% for nanopore read alignment to the recent telomere-to-telomere CHM13 assembly, and to 86.65% from 83.48% for nanopore read alignment to the GRCh37 human genome.
Availability and implementation: https://github.com/joshidhaivat/HQAlign.git
Full Text Available
Player-Replaceability and Forensic Support are Two Sides of the Same (Crypto) Coin

Sheng, Peiyao; Wang, Gerui; Nayak, Kartik; Kannan, Sreeram; Viswanath, Pramod (May 2023, Springer)

Player-replaceability is a property of a blockchain protocol that ensures every step of the protocol is executed by an unpredictably random (small) set of players; this guarantees security against a fully adaptive adversary and is a crucial property in building permissionless blockchains. Forensic Support is a property of a blockchain protocol that provides the ability, with cryptographic integrity, to identify malicious parties when there is a safety violation; this provides the ability to enforce punishments for adversarial behavior and is a crucial component of incentive mechanism designs for blockchains. Player-replaceability and strong forensic support are both desirable properties, yet none of the existing blockchain protocols has both. Our main result is to construct a new BFT protocol that is player-replaceable and has maximum forensic support. The key invention is the notion of a "transition certificate", without which we show that natural adaptations of extant BFT and longest chain protocols do not lead to the desired goal of simultaneous player-replaceability and forensic support.
Full Text Available
Fundamental Limits of Multi-Sample Flow Graph Decomposition

https://doi.org/10.1109/ISIT50566.2022.9834518

Mazooji, Kayvon; Kannan, Sreeram; Noble, William Stafford; Shomorony, Ilan (July 2022, IEEE International Symposium on Information Theory)

Full Text Available
CellMeSH: probabilistic cell-type identification using indexed literature

https://doi.org/10.1093/bioinformatics/btab834

Mao, Shunfu; Zhang, Yue; Seelig, Georg; Kannan, Sreeram (February 2022, Bioinformatics)
Birol, Inanc (Ed.)
Abstract
Motivation: Single-cell RNA sequencing (scRNA-seq) is widely used for analyzing gene expression in multi-cellular systems and provides unprecedented access to cellular heterogeneity. scRNA-seq experiments aim to identify and quantify all cell types present in a sample. Measured single-cell transcriptomes are grouped by similarity and the resulting clusters are mapped to cell types based on cluster-specific gene expression patterns. While the process of generating clusters has become largely automated, annotation remains a laborious ad hoc effort that requires expert biological knowledge.
Results: Here, we introduce CellMeSH, a new automated approach to identifying cell types for clusters based on prior literature. CellMeSH combines a database of gene–cell-type associations with a probabilistic method for database querying. The database is constructed by automatically linking gene and cell-type information from millions of publications using existing indexed literature resources. Compared to manually constructed databases, CellMeSH is more comprehensive and is easily updated with new data. The probabilistic query method enables reliable information retrieval even though the gene–cell-type associations extracted from the literature are noisy. CellMeSH is also able to optionally utilize prior knowledge about tissues or cells for further annotation improvement. CellMeSH achieves top-one and top-three accuracies on a number of mouse and human datasets that are consistently better than existing approaches.
Availability and implementation: Web server at https://uncurl.cs.washington.edu/db_query and API at https://github.com/shunfumao/cellmesh.
Supplementary information: Supplementary data are available at Bioinformatics online.
Full Text Available
Interpreting neural networks for biological sequences by learning stochastic masks

https://doi.org/10.1038/s42256-021-00428-6

Linder, Johannes; La Fleur, Alyssa; Chen, Zibo; Ljubetič, Ajasja; Baker, David; Kannan, Sreeram; Seelig, Georg (January 2022, Nature Machine Intelligence)

Full Text Available
A deep adversarial variational autoencoder model for dimensionality reduction in single-cell RNA sequencing analysis

https://doi.org/10.1186/s12859-020-3401-5

Lin, Eugene; Mukherjee, Sudipto; Kannan, Sreeram (December 2020, BMC Bioinformatics)

Full Text Available
Everything is a Race and Nakamoto Always Wins

https://doi.org/10.1145/3372297.3417290

Dembo, Amir; Kannan, Sreeram; Tas, Ertem Nusret; Tse, David; Viswanath, Pramod; Wang, Xuechao; Zeitouni, Ofer (November 2020, CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security)
Full Text Available
Deepcode: Feedback Codes via Deep Learning

https://doi.org/10.1109/JSAIT.2020.2986752

Kim, Hyeji; Jiang, Yihan; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod (May 2020, IEEE Journal on Selected Areas in Information Theory)

Full Text Available
RefShannon: A genome-guided transcriptome assembler using sparse flow decomposition

https://doi.org/10.1371/journal.pone.0232946

Mao, Shunfu; Pachter, Lior; Tse, David; Kannan, Sreeram; Chen, Zhong-Hua (June 2020, PLOS ONE)

Full Text Available
LEARN Codes: Inventing Low-Latency Codes via Recurrent Neural Networks

https://doi.org/10.1109/JSAIT.2020.2988577

Jiang, Yihan; Kim, Hyeji; Asnani, Himanshu; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod (May 2020, IEEE Journal on Selected Areas in Information Theory)

Full Text Available
